An Efficient Parsing Algorithm for Tree Adjoining Grammars

Author

  • Karin Harbusch
Abstract

In the literature, Tree Adjoining Grammars (TAGs) are promoted as adequate for natural language description, for analysis as well as generation. In this paper we concentrate on the direction of analysis. Especially important for an implementation of that task is how efficiently this can be done, i.e., how readily the word problem can be solved for TAGs. Up to now, a parser with O(n^6) steps in the worst case was known, where n is the length of the input string. In this paper, the result is improved to O(n^4 log n) as a new lowest upper bound. The paper demonstrates how local interpretation of TAG trees allows this reduction.

1 INTRODUCTION

Compared with the formalism of context-free grammars (CFGs), the rules of Tree Adjoining Grammars (TAGs) can be imagined intuitively as parts of context-free derivation trees. Leaving aside some further restrictions on these rules, the recursion operation (adjoining) replaces a node in a TAG rule by another TAG rule, so that larger derivation trees are built. This close relation between CFGs and TAGs might suggest that they are equivalent, but TAGs are more powerful than context-free grammars. This additional power, characterized as mildly context-sensitive, leads to the question of whether there are efficient algorithms to solve the word problem for TAGs. Up to now, the algorithm of Vijay-Shanker and Joshi with a worst-case time complexity of O(n^6) was known, in addition to several unsuccessful attempts to improve this result. This paper's main emphasis is on the improvement of this result. An efficient parser for Tree Adjoining Grammars with a worst-case time complexity of O(n^4 log n) is discussed.

All known parsing algorithms for TAGs use the close structural similarity between TAGs and CFGs, which can be expressed by writing all inner nodes and all their sons in a TAG as the rule set of a context-free grammar (the context-free kernel of a TAG). Additionally, the constraint has to be tested that all further context-free rules corresponding to the same TAG tree must appear in the derivation tree if and only if one rule of that TAG tree is in use. Therefore, a context-free parser can serve as the basis for extensions that test this additional constraint. On the basis of the two fundamental context-free analysers, the different approaches for TAGs can be …
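
To make the notion of a context-free kernel concrete, the following Python sketch reads one context-free rule off every inner node of a toy TAG and tags each rule with the tree it came from. It is not the parser described in the paper. The names `Node`, `Rule`, `context_free_kernel`, `uses_whole_trees`, and the two example trees are all hypothetical, and the final check is only a simplified necessary condition for the co-occurrence constraint, ignoring tree structure and multiplicities.

```python
from typing import NamedTuple


class Node(NamedTuple):
    label: str
    children: tuple = ()  # an empty tuple marks a leaf (terminal, foot, or frontier node)


class Rule(NamedTuple):
    tree_id: str  # name of the TAG tree the rule was read off (assumed naming)
    lhs: str
    rhs: tuple


def context_free_kernel(trees):
    """Return one context-free rule per inner node, in preorder per tree."""
    rules = []
    for tree_id, root in trees.items():
        stack = [root]
        while stack:
            node = stack.pop()
            if node.children:
                rhs = tuple(child.label for child in node.children)
                rules.append(Rule(tree_id, node.label, rhs))
                # Push children reversed so the leftmost child is visited first.
                stack.extend(reversed(node.children))
    return rules


def uses_whole_trees(used, kernel):
    """Simplified check: if any rule of a tree is used, all its rules are used."""
    used_set = set(used)
    touched = {rule.tree_id for rule in used_set}
    return all(rule in used_set for rule in kernel if rule.tree_id in touched)


# Toy grammar (an assumption for illustration only).
alpha = Node("S", (Node("NP"), Node("VP", (Node("V"),))))  # initial tree
beta = Node("VP", (Node("VP"), Node("Adv")))               # auxiliary tree, foot node VP

kernel = context_free_kernel({"alpha": alpha, "beta": beta})
for rule in kernel:
    print(rule.tree_id, ":", rule.lhs, "->", " ".join(rule.rhs))
```

A plain context-free parser run on `kernel` alone would accept derivations that use `S -> NP VP` from `alpha` without the matching `VP -> V`. Keeping the `tree_id` on each rule is one way to let an extended parser reject such derivations.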

Similar resources

Tabulation of Automata for Tree-Adjoining Languages

We propose a modular design of tabular parsing algorithms for tree-adjoining languages. The modularity is made possible by a separation of the parsing strategy from the mechanism of tabulation. The parsing strategy is expressed in terms of the construction of a nondeterministic automaton from a grammar; three distinct types of automaton will be discussed. The mechanism of tabulation leads to the...

Parsing Tree Adjoining Grammars and Tree Insertion Grammars with Simultaneous Adjunctions

A large part of wide coverage Tree Adjoining Grammars (TAG) is formed by trees that satisfy the restrictions imposed by Tree Insertion Grammars (TIG). This characteristic can be used to reduce the practical complexity of TAG parsing, applying the standard adjunction operation only in those cases in which the simpler cubic-time TIG adjunction cannot be applied. In this paper, we describe a parsi...

Lambek Grammars, Tree Adjoining Grammars and Hyperedge Replacement Grammars

Two recent extensions of the nonassociative Lambek calculus, the Lambek-Grishin calculus and the multimodal Lambek calculus, are shown to generate the same class of languages as tree adjoining grammars, using (tree generating) hyperedge replacement grammars as an intermediate step. As a consequence, both extensions are mildly context-sensitive formalisms and benefit from polynomial parsing algorithms.

Parsing Tree Adjoining Grammars With A Preprocessor

This paper presents a preprocessor-based parsing system for Tree Adjoining Grammars. The preprocessor is used for two purposes: (1) to organize the data structures, and (2) to reduce the runtime processing load so that the parser executes quickly. A parallel parsing algorithm is presented that takes advantage of the preprocessor. The future goals of the proposed research are to achieve scalability and...

Improved Head-Corner Parsing of Tree-Adjoining Grammars

In this paper we present a chart-based head-corner parsing algorithm for Tree Adjoining Grammars (TAGs). We then present an improved algorithm that has an O(n) bound for the number of head-corner predictions as compared to previous approaches which have a bound of O(n). The improvement is made possible by changing the definition of head-corner traversal and moving some aspects of top-down predicti...

Fast LR parsing Using Rich (Tree Adjoining) Grammars

We describe an LR parser of parts-of-speech (and punctuation labels) for Tree Adjoining Grammars (TAGs) that resolves table conflicts in a greedy way, with a limited amount of backtracking. We evaluate the parser using the Penn Treebank, showing that the method yields very fast parsers with at least reasonable accuracy, confirming the intuition that LR parsing benefits from the use of rich grammars.

Publication date: 1990